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Abstract 

We propose a novel Bayesian model selection technique on linear mixed-effects mod¬ 
els to compare multiple treatments with a control. A fully Bayesian approach is imple¬ 
mented to estimate the marginal inclusion probabilities that provide a direct measure 
of the difference between treatments and the control, along with the model-averaged 
posterior distributions. Default priors are proposed for model selection incorporating 
domain knowledge and a component-wise Gibbs sampler is developed for efficient pos¬ 
terior computation. We demonstrate the proposed method based on simulated data 
and an experimental dataset from a longitudinal study of mouse lifespan and weight 
trajectories. 


1 



1 Introduction 


Experiments are run by researchers in biology, medicine, and various other scientihc fields, 
to compare multiple treatments with a control or standard treatment. Often these studies 
are conducted over a period of time and result in unbalanced repeated measured data that 
is commonly analyzed by the linear mixed-effects model (LMM). The LMM allows for some 
subsets of the regression parameters to vary among subjects, thereby accounting for sources 
of natural heterogeneity in the population. It models the mean response as a combination 
of population characteristics (fixed-effects) that are assumed to be shared by subjects, and 
subject-specific characteristics (random-effects) that are unique to a particular subject. 
It is common to introduce a set of fixed-effects for each group to model the effect of the 


treatment (see e.g. Fitzmaurice et al., 2004). To compare treatment groups with the control 
groups is, therefore, equivalent to comparing the sets of fixed-effects. Researchers are often 
interested in deciding which treatments are different from the control, and measuring the 
corresponding significance of the discrepancy. 

Standard model selection procedures can be implemented to answer these questions 


(see e.g. Bolker et al. 

2009 

Fitzmaurice et al. 

2004 

) with certain limitations. One can 

select models by using hypothesis tests ( 

Stephens et al. 

2005 

); that is, test simpler nested 


models against more complex models and report corresponding p-values. Although the 
likelihood ratio test (LRT) is widely used to determine the contribution of a factor in 
a model throughout statistics, it is not recommended by Pinheiro and Bates (2006) for 
testing fixed-effects in LMM, because of its unreliability for small to moderate sample 
size. Also, when the focus is to compare multiple treatments to a control, [Burnham and 


Anderson (2002) criticize that such a pairwise comparison as an abuse of hypothesis testing. 


Another extensively used approach is the information-theoretic model selection procedure 


that allows comparison of multiple models (see e.g. Burnham and Anderson, 2002). This 
method relies on information criteria, such as Akaike information criterion and Bayesian 
information criterion (BIG), that use deviance as a measure of fit with a penalization on 
more complex models. Instead of reporting p-values, it estimates the magnitude of difference 
between models in expected predictive power and uses this to make a decision as to whether a 
variable should be included in the model or not. The resulting dichotomous decisions overly 
simplify the problem, and, in our case, withhold important information on the magnitude 
of the difference between a treatment and the control. 

Motivated by these practical challenges faced by frequentist approaches, we resort to 


Bayesian model selection techniques (for a review see e.g. Clyde and George, 2004;[George 


and McGulloch 1997; Kuo and Mallick, 1998). In the Bayesian framework, this problem 


can be transformed to the form of parameter estimation (O’Hara et ah, 2009). That is. 
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estimating the marginal posterior probability that a variable should be in the model, i.e. the 
marginal inclusion probability, which can usually be calculated directly from the posterior 
inference using an Markov chain Monte Carlo (MCMC) simulation. 

There is an extensive literature on Bayesian model selection. [George and McCulloch 


(1993); Geweke et al. (1996) develop the stochastic search variable selection (SSVS) tech¬ 
nique for linear regression models that uses a Gibbs sampler to traverse the model space. 


Smith and Kohn 

1996) 

extend its application to nonparametric regression models and show 

how integrating the regression parameters is essential to reliable convergence of a Gibbs sam- 

pier. 

Kohn et al 

(2001 

) propose a more efficient single-site Metropolis-Hastings sampler. 

Holmes et al. 

(2002 

) consider selection and smoothing for a series of seemingly unrelated 


regressions. Chen and Dunson (2003); Kinney and Dunson (2007) develop variable selection 
for both fixed and random effects in generalized LMM. Recently, Bayesian model selection 
methods are extended to a series of spatially linked regression for functional magnetic reso¬ 


nance imaging analysis (see e.g. Lee et al. 2014 Smith and Fahrmeir, 2007). However, we 
are unaware of any work to extend Bayesian model selection on LMM to compare multiple 
treatments with a baseline. 

In this article, we develop a novel Bayesian model selection approach on LMM that ac¬ 
commodates and compares multiple treatment effects. The method includes a re-parameterization 
of the fixed-effects of each treatment that attributes part of the effect to a baseline for di¬ 
rect measure of the difference between a treatment and the control. A modihcation of 


the fractional prior (Smith and Kohn, 1997) is proposed to undertake model selection and 
averaging, which is also related to Zellner’s g-prior (Zellner 1986). The proposed prior 
incorporates information on subjects within the same group, which is critical to developing 
an efficient component-wise Gibbs sampler. This Bayesian paradigm provides practitioners 
with an intuitive understanding of the significance of each treatment through the marginal 
inclusion probability, which is unaccessible using existing techniques. 

Our work is motivated by a longitudinal experiment of mouse lifespan and weight tra¬ 
jectories (see e.g. Spindler et al.| 2014a||b 2013a b, 2014c) that aims to study how different 
treatments affect lifetime weight trajectories and identify potential longevity therapeutics. 
This application provides both a clear demonstration of our approach, and an example of 
enabling researchers to obtain previously unavailable information. However, the method 
itself is more general and applicable to most experiments that are interested in comparing 
multiple treatments to a control. 
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1.1 Experimental Data 


The experimental data is from a longitudinal study of the lifespan and the weight trajectories 
of an FI hybrid mouse (see e.g. Spindler et ah, 2014a b, 2013a|b 2014c). The study is part 
of a compound screening program designed to identify potential longevity therapeutics, and 
it was approved by the Institutional Animal Care and Use Committee at the University of 
California, Riverside. It utilized an unbalanced statistical design to compare the lifespan 
and the weight trajectories of multiple treatment groups to that of one larger control group 
(Jeske et ah, 2014). The disposition of dietary calories between body weight and metabolic 
energy appears to be a key to lifespan determination. In this article, we use a part of the 
dataset that recorded mouse body weight changes during the course of the experiment. 


In the study, 2266 male C3B6F1 mice were initially fed a chow diet ad libitum. At 12 
months of age (Day 365), 297 mice were shifted to daily feeding with 13.3 kcal/day/mouse 
of the control diet (Diet No.99), and the rest were shifted to control diet supplemented with 
one of 56 chemical, pharmaceutical, or nutraceuticals agents or combination of agents. All 
mice were fed daily and weighted bimonthly, but the number of mice progressively declined 
as the study progressed due to the onset of various age-related pathologies. The data are 
censored at extreme old age (Day 1369), when less than 1% of the mice remained. 

The control and drug-treated mice gradually lost weight after the shift to the defined 
diets, which provided about 10% less than the ad libitum number of calories, to ensure the 
mice consumed all their food. Our main interest is to determine which supplemented diets 
significantly affected the lifetime weight trajectories. That is, researchers are interested in 
whether any deviation from the trajectory of the control group (Diet No.99) is statistically 
significant and is caused by dietary additions. 

The rest of the paper is organized as follows. Section [^formally introduces the Bayesian 
variable selection methodology. It outlines the re-parameterization of a LMM, prior speci¬ 
fication, MCMC sampling schemes, and stopping criterion utilized. A simulation study is 
also detailed to evaluate the performance of the proposed method. Section contains the 
empirical results from the analysis of the motivating example. Section concludes with a 
discussion. 


2 Model Selection on Linear Mixed-effects Models 

In general, suppose that we have n subjects from G experimental groups under study, each 
with Ui observations taken repeatedly over time, i = 1, • • • , n, and let y* = (yi,i, • • • , 
denote the response vector for the i-th subject. Assume the i-th subject is from the y-th 


4 














group, for i = 1, • • • , n, 5 = 1, • • • , G, let Xj and Zi be two rii x p design matrices, then a 
LMM (]Fitzmaurice et aT 2004 McCullagh and Nelder, 1989) is denoted as 


Hi — ^iOtg + Zjbi -|- 6j, ~ ^ 


( 1 ) 


where ag = (ag,o, • • • j are the fixed effects shared by subjects in the g-ih experi¬ 
mental group. Further, denote bi = • • • , ~ -^p(0, as the random effects 

that are unique to the z-th subject, and hence we allow subject specific trajectories. 

Note that, among the G groups, there is one control group and G — 1 treatment groups. 
Without loss of generality, let us assume the G-th group is the control group, and g = 
1, • • • , G — 1 are the treatment groups. A primary goal for many experiments is to determine 
which alternative treatments significantly differ from the control group. To this end, we 
propose a re-parameterization of the fixed effects ot^’s in Q, = 1, • • • G. Let Wi, Xi and Zi 
be three Ui x p design matrices, the re-parameterized model is denoted as, for i = 1, • • • , re, 
g = 1, - ■ ■ , G, 

Vi = WiaXiPg + Zibi + ei, Ci ^ Nn-{0,a'^I), (2) 

where bi = {bi^Q,--- ~ Np{0, are the random effects as in Q, and o: = 

(ao, • • • , Op-i)^ are the fixed effects of the control group, 0g = {Pgfl, • • • , are the 

fixed effects modeling the difference between the fl'-th group and the control group. That 
is, the group effect ag in Q is re-written as a + 0g in Q, for = 1, • • • , G. Also, it is 
straightforward to see, as the baseline, Pg = (0, • • • , 0)^ for the control group. 

Under the re-parameterization, the detection of significant treatments is equivalent to 
the identification of nonzero jSg’s. To this end, we introduce 0/1 binary indicators 7 g = 
( 73 , 0 , • • • , 79 ,p-i)^, 5 = 1, • • • , G, such that /3gj = 0 if = 0 and /3gj / 0 if jgj = 1. 
The jgj is used to indicate whether the fixed effect on the j-th predictor of the g'-th group 
differs from that fixed effect of the control group. Given let ^gi'ig) be the vector of 
nonzero fixed effects and Xi{'yg) be the corresponding design matrix. Then, the model Q 
can be written as, for z = 1, • • • , re, g = 1, • • • , G, 


Ui Wia Xil^g^Pg{^g') Zibi €■% ■ 


(3) 


This formulation allows us to look at the problem from the Bayesian SSVS perspective 


(George and McGulloch, 1993). The SSVS searches for models having high posterior prob¬ 
ability by traversing the model space using MGMC techniques, and, thus, identifies subsets 
of predictors with nonzero coefficients. Moreover, it allows us to calculate the posterior 
distributions of the parameters by marginalizing over the other variables. In this way, the 
marginal inclusion probability can be obtained as a direct measure of the significance of 
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each treatment. 

Note that, (§ is a very general setting that is applicable to a wide range of applications. 
It is possible to impose specific structures on 7 to suit different scenarios to further simplify 
the modeling procedure. For example, given the setups of the motivating experiment, it is 
reasonable to assume a common intercept for all groups, since the mice were on the same 
diet at the first measurement; that is, (5gfl = 0, for <7 = 1, • • • , G. Therefore, given that 
the primary goal is to compare treatment groups to the baseline group G, it is desirable to 
impose the following settings on 7 , 

7i,o = • • • = 7g-i,o = 7g,o = 0, 

7g = (7G,o, • • • , 7G.p-i)'^ = (0, • • • , 0)"^. 


2.1 Prior Specification 

A proper prior must be placed on the nonzero coefficients Pgi'yg) to undertake model aver¬ 


aging (see e.g. George and McCulloch 1993; Kohn et al. 2001; Mitchell and Beauchamp 


1988; [Smith and Kohn 

1996 

). In particular. 

Kohn et al. 

( 2001 ); 

Smith and Fahrmeir 


(2007) propose a conditional prior for the coefficients by setting it proportional to a frac¬ 


tion of the likelihood. This fractional prior is related to the g-prior in Zellner (1986), 


and is located and scaled in line with the information from the likelihood. We propose 
a modification of this idea to accommodate multiple subjects within a group by setting 
{Pg{'yg)\y,Oi,'rg,b,a^) (xUifzg p{yi\a,Pg{'yg),'yg,bi,a'^)^^'"\ so that 




Pg{'rg)\y,a,'rg,b,a‘^ -- N \^g{')g),a‘^ \ {')g)Xi{')g) 

where ^g( 7 g) = {'^g)Xi{'ig)) 


( 4 ) 


(Eies ^'^ 9 ) (.Vi - - Zibi)), and 

stands for summation over all the subjects that belong to the ^-th group. 


This prior is proportional to the variance of the least squares estimate of /3, and enjoys 
a number of attractive properties as pointed out by Kohn et al. (2001). The prior @ 
is rescaled automatically if the design matrix X or the data y is rescaled because of its 
structure and the presence of ci^. Moreover, this prior is invariant to location changes in X 
and y given the basis term (1, • • • , 1)^ is included in X. Also it is data-based since /Sg( 7 g) 
depends on y, which allows proper centering of /3. 

We consider the prior on 7 to be vr( 7 g| 7 rg) = 0^=0 ^( 79 dl %))5 = I)''' jG'j where 
^(797 Kg) Bernoulli{'Kg) and tt = (tti,-- - , ttg)^ is a vector of hyper-parameters that 
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represents prior knowledge for every groups. Intuitively, vr^ is the probability that re¬ 
searchers believe the g-th group is significantly different from the control group before 
conducting the experiment. For instance, we find a sensible setting, when there is little 
prior knowledge of the effects of the treatments, to be letting ttq = 0 for the control group, 
and TTi = • • • = ttg-i = 0.5 for the G — 1 treatment groups. We assume standard priors in 


Bayesian hierarchical models (see e.g. Gelman et al., 2004; Johnson and Jones, 2010; Smith 
and Kohn, 1996) for the rest of the parameters, i.e. a, b, 


aids, di ~ Np(ds, 




bilXo ~ iVp(0, i = !,■■■ , n 

XD\di,d2 ~ r(di, ^ 2 ) 

7r((T^) OC 1/(7^ 

where di, d 2 ,ds, d^ are hyper-parameters to be specified. 


2.2 Posterior Inference 


Combining the priors and likelihoods, the full joint posterior density for 6 = (a, /3, 7 , b, cr^, Xd) 
is characterized by 


q{a,l3,-f,b,a^,XD\y) oc 


' G 

n 

Wp{yi\a,Pg,'ig,bi,a‘^)'K{bi\XD) 

^ 7r(7g) 

9=1 

i&g 



X 7r(Q!)7r(Ai:))7r((T^). 


(5) 


This distribution has a complex form which we cannot sample from directly; instead, 
we resort to MCMC methodology for the posterior inference and employ a component-wise 


strategy (Johnson et ah, 2013). Specifically, we propose a component-wise Gibbs sampler 
for posterior computation. To this end, we need the full conditional posterior distributions 
of each of the parameters in 9 to update the Markov chain. The derivation of the full 
conditional posterior distributions follows from ([^ using straightforward algebraic route. 


Schematically, we can set up a six-variable component-wise Gibbs sampler; that is, if we 
let 9 = ( 7 , /3,q:, 6 , A^)) be the current state and 9' = ( 7 ',/?', a', {a^)',b',X!j^) be the future 

state, we iteratively sample from the full conditional posterior distributions to update the 
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chain 


(7,/3,a,cT^6,AD) ^ {'j', I3,a,a^,b, Xd) (7',/3',q:,ct^,6, Ad) ^ ( 7 ',/3',q:', cj^ 6, Ad) 

-5” {l',P',oc',{a‘^y,b,XD) (7',/3',Q!',((T^)',6',Ad) ( 7 ,/3',Q!', (cr^)', 6', A'd)- 

Step 1. The transition 7 —>■ 7 ^ consists of G x p steps, 

(71,0,71,1) • • • ,7i,p-i, • • • ,7G,o, • • • ,7G,p-i) ( 7 ^, 0 ,71,1)''' ,7i,p-i,''' ,7G,o, • • • ,7G,p-i) 

( 7 ^, 0 ,7^,1) • • • ,7i,p-i, • • • ,7G,o, • • • ,7G,p-i) 


( 7 ^, 0 ,7^,1) •• • ,7^,p-i,--- ,7^G,o,-- - ,7^G,p-i)- 


From the Appendix, we have, for s' = 1, • • • ,G and j = 0, • • • ,p — 1, 


I rij -V^(7»)-V(7j)l i 
+ i)vr( 7 ,)V,( 7 »)l ) 


X exp 


1 


iGg ^ i^g ^ ^ ^ i^g ^ ^ ^ i^g ^ 

E(i + ;!:)V"’(7,)^y(E(i + ;!:)F’(7,)Xi(7,)) YE(i + Y>E(7,¥, 


i&g 


i&g 


i&g 


( 6 ) 


where 7-(p,i) = ( 75 , 0 , • • • , 7p,i-i, 75 ,i+i,''' , 7 p,p-i)^ and <j)i = yi - Wia - Zibi. 

At each step, an update is simulated from 'jgj Qh 9 ,Mn-{g,j),b,a^,y). Since 7^7 is 
binary, i.e. ^gj G { 0 , 1 }, the conditional posterior distribution Q{'yg,j\oi,'y-{g,j), b, y) 
is easily normalized by evaluating Q for ■ygj = 0 and ^gj = 1 . 

Step 2. The transition fi ^ 0 consists of G steps, 

(/3i,^2, • • • ,^g) { 01 , 02 , ■ ■ ■ , 0 g) 

- , 0 g) 

- , 0 g'). 


At each step, an update is simulated from a p-dimensional multivariate normal dis- 





tribution, 


1 


~ N^p-l I V^ 


-1 


■'1 




Y,Xj{^g){y^-W^a-ZM 


i£g 




-1 


(7) 


where Vi = ^ Eiegi'^ + ^)XTilgWlg). 


Step 3. Consider updating a where the update is simulated from a p-dimensional multivariate 
normal distribution, 

a'~ q{a\l3,-f,b,a‘^,y) 


r^Np V 2 


-1 


G 




g=l \ ieg 
\ T 


E E U* - X,{')g)^g{lg) - 


+ 1 > ,- 
rii 
*eg 


E {lg)wA ( E i'rg)Mlg 


/ rii 

iSg 


E hg)iyi - Xi{'^g)Pg{')g) - Zibi) 
IT'i 
*eg 


+ ^4^3 




( 8 ) 


where 


G 




9=1 L i&g 


E »r.+(E Xr (7,)w'y(E yr(7 ,)v,(7,)) ‘ (E Ee 

i &9 ^ ieg * 7 V J \ n* 




Step 4. Consider updating cj^. At each step, an update is simulated from a Inverse-Gamma 


9 



distribution, 


g(cj^|Q!,/3,7,6,y) 


fl ^ 

Inv-Gammai -{N + EE 79 j), 

9=1 i=0 


'^(vi-Wia- Xi{jg)0g{jg) - ZA^ i - Witt - Xi{'i g) ^ g{'^ g) - Zih^ 

9=1 1 * 69 '^ ^ ^ 

fE;!:^?'(r«)^.( 7 ,)') YE; 7 :^*"(r»)*')l^fE; 7 :^.^(r»)^.(-r») 


9=1 L i&g 
+ 


ie 9 


ie 9 


ie 9 


-1 


^9-(Y1 ~r^i (79)^i(79)) ( ( 79 )'^^ 

V V- / V V- 



(9) 


where N = X]^=i Yli^g ii-*- Note that, we denote Inv-Gamma{a, j3) 
for X G (0, 00 ), and a, /? > 0. 


_ 1^°" ^—a—l 

- r(a)^ 


exp 



Step 5. The transition h ^ h' consists of n steps. 


(61,62, • • • ,6n) —^ (61Y62, • • • ,6n) 
—J" ( 61 Y 62 ^ • • • , hn) 


, 6 „'). 

At each step, assuming the i-th subject is from the ^(-th group, an update is simulated 
from a p-dimensional multivariate normal distribution, 

6 /~ q{hi\aAg,lg,cr‘^AD,yi) 




(-zr^.(7,)(E(E+ 

+ Zj (^Vi - Wia - (1 + ^)XiAg)Pg{'rg)^ 


j^g 

j¥=i 


,V3~^ , 


( 10 ) 


where F 3 = ^Z^Z, + X^I + j.^ZfX^i'rg) ( E.e, ^.Xj{^g)Xg{'^g)^ 'Xf{jg)Z,. 
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Step 6. Finally, consider updating A^- At each step, an update is simulated from a Gamma 
distribution. 


~ Qi^nlb) 


r(Y + di, + 


G 


( 11 ) 


9=1 i&9 


The posterior inference on model parameters can be carried out using the MCMC sam¬ 
ples. Models with high posterior probability can be identified as those appearing most often 
in the MCMC output. One posterior quantity of interest is the marginal inclusion proba¬ 
bility, i.e. Pr( 7 gj = l|y), g = 1, - ■ ■ ,G, j = 0, - ■ ■ ,p — 1, which can be calculated using the 
proportion of draws in which 'jgj is non-zero. It provides a direct measure of the signifi¬ 
cance of I3gj, which remains challenging for the current frequentist methods. It, therefore, 
allows researchers for straightforward understanding of the significance of each treatment. 
Also, if needed, one may classify a treatment effect such that Pr( 7 gj = l\y) > 0.8772 as 


significant or otherwise insignificant (see e.g. Lee et ah, 2014, Raftery et ah, 1996; Smith 


and Fahrmeir, 2007). 


2.3 Stopping Criterion 

Determining how long to run an MCMC simulation is critical to performing legitimate pos¬ 
terior inference. Premature termination often runs the risk of getting inaccurate estimates. 


The relative standard deviation fixed-width stopping rule (FWSR) (see e.g. Flegal and 


Gong, 2015 Gong and Flegal, 2015) is implemented to terminate the MCMC simulation. 


It is a member of the FWSR family (for e.g. see Flegal and Gong 2015; Flegal et ah, 2008 


Jones et ah, 2006). The relative standard deviation FWSR is theoretically valid in that it 


terminates a simulation w.p. 1 and the resulting confidence interval achieves the nominal 
coverage probability. Moreover, it automates the stopping procedure for practitioners and 
outperforms convergence diagnostics in various numerical studies. Interested readers are 
directed to their papers for more details. 

In short, the relative standard deviation FWSR terminates the simulation when the 
computational uncertainty is relatively small to the posterior uncertainty. Specifically, it 
controls the width of a confidence interval from a Markov chain central limit theorem 


through a threshold e and significant level 6. Gong and Flegal (2015) also establish a 


connection between the standard deviation FWSR and using effective sample size (ESS) 
as a stopping criteria, i.e. K = where K is the number of effective samples and 

zs /2 is a critical value from the standard Normal distribution. Based on this connection. 
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for instance, setting e = 0.124 and 6 = 0.05 in the relative standard deviation FWSR is 
equivalent to terminate the simulation when an ESS reaches K = 1000. 


2.4 Simulation Study 

We report the results of a simulation study undertaken to validate the model and estimation 
procedure. The simulated dataset consists of a control group and five treatment groups. 
The control group is simulated based on estimates from maximum likelihood estimation 
(MLE) of a LMM on the control group of the experimental data. That is, denote Yi^t ,99 as 
the weight of mouse i G {I,-- - ,297} from the control group (Diet No.99) taken at time 
t £ {365,395,456,517,578,639,700,760,821,882,943,1004,1065,1125,1186}, we consider 
the following LMM based on Q, 

hi,t,99 = <ao + exit + bo^i + bi^it + Ci.t ~ -^(0, cr^), (12) 


where ao and ai are the global intercept and slope, 6o,i and bi^i are the subject specific 
random effects, where bi = (5o,i,6i,i)^ ~ N 2 { 0 , I), and Cj,* is the measurement error. 

Note that, as mentioned, the /3’s in Q are set to zero for the control group to serve as the 
baseline model. 


Parameter estimation of (12) was carried out using the lmer() function in the R package 
lme4 (Bates et ah, 2012). Notice time was rescaled using t = {t — 365)/365 prior to model 
htting. The MLE estimates are a = (45.49, —5.75)^ and = 5.06, and we set = 1.0. 
Based on these estimates, we simulated 297 subjects from (12) as the control group. 

We then simulated five treatment groups, each with 36 subjects, by adding /?g,i’s to 
(12), while keeping other settings the same as for the simulated control group. 


Yi,t,g — OlQ + (Xlt + + 6o,i + bl^it + €i^t, 5 — 1; ''' j 5, 


(13) 


where /3gq G {—2.0,—0.5, 0.0,0.5, 2.0} for each group. To be consistent with the experi¬ 
mental settings, we artificially differentiate the slope of each treatment group by ai + /3g,i, 
but maintained the same global intercept ao, since all mice were on the same diet at t = 0. 
Note that, we did not incorporate the “die-ofF’ mechanism from the experiment into the 
simulation. Figureshows the mean weight trajectories for this simulated dataset. 


We followed the prior specification outlined in Section 2T The hyper-parameters di , d 2 
were set to di = 0.001,^2 = 0.001 for the prior on Ad to be vague. The hyper-parameters 
for Q! 1^3,44 were set using estimates obtained from a fitted LMM on the simulated control 
group. The prior inclusion probabilities for the treatment groups, i.e. vr^, g = 1, - ■ ■ ,5, were 
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Mean Weight Trajectories 



Figure 1: Mean weight trajectories for the 5 treatment groups and the control group. 


set to 0.5 for equal probability between inclusion and exclusion. 


The component-wise Gibbs sampler was run as described in Section 2.2 The simulation 
was terminated by the relative standard deviation FWSR with the tuning parameters e = 
0.124 and 5 = 0.05. It resulted in 16385 iterations with an effective sample size of at least 
1000 for estimation of the posterior mean of all the parameters. The resulting MCMC 
outputs show the chain is mixing well and centered near the true parameter values. 


Table 1: Fixed-effects estimates for the simulated dataset. For MLE, mean and 95% con¬ 
fidence interval (Cl) are presented. For posterior inference, posterior mean, 95% credible 
interval (Cl) and marginal inclusion probability (standard error in the parenthesis) are 
presented. 


Parameter 

Truth 


MLE 


Posterior 


Mean 

95% Cl 

Mean 

95% Cl 

Pr(7sj = My) 

ao 

45.50 

45.570 

(45.431, 45.570) 

45.591 

(45.484, 45.700) 


ai 

-5.75 

-5.708 

(-5.852, -5.565) 

-5.716 

(-5.822, -5.612) 


Pi,i 

-2.00 

-2.130 

(-2.546, -1.713) 

-2.126 

(-2.562, -1.685) 

0.992(6.10e-5) 

1^2,1 

-0.50 

-0.693 

(-1.109, -0.276) 

-0.698 

(-1.126, -0.267) 

0.983(7.44e-4) 


0.00 

-0.092 

(-0.508, 0.325) 

-0.093 

(-0.518, 0.341) 

0.442(3.88e-3) 

/34,1 

0.50 

0.708 

(0.292, 1.125) 

0.708 

(0.283, 1.136) 

0.987(5.74e-4) 


2.00 

2.266 

(1.849, 2.683) 

2.268 

(1.830, 2.695) 

0.992(6.10e-5) 


We compare our results to the estimates from the frequentist approach (see e.g. Fitzmau- 


rice et al., 2004), as it is widely used to model such problems. Researchers often combine 


LMM with certain model selection criteria, e.g. BIC, to determine which treatment are 
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significantly differ from the control (see e.g. Spindler et al. 2013a). Tablecontains the 
posterior estimates from the proposed model, along with the MLE estimates from lme4 
for the LMM. Despite that both approaches result in estimates close to the truth, our ap¬ 
proach introduces the marginal inclusion probability for each treatment group that is vital 
to straightforward interpretation and correct ranking of the significance of the treatment 
effects. If a dichotomous decision is desired, setting a threshold to the suggested value of 
0.8772, one would correctly classify Diet No.3 to be insignificant relative to the control. 

The sensitivity to the prior inclusion probabilities was also evaluated by repeating the 
simulation with tt^’s set to ranging from 0.3 to 0.7. We found no difference in model ranking, 
although the parameter estimates were slightly different. Other simulation settings showed 
comparable parameter estimations between our method and the MLEs, and correctness in 
model ranking, although the results are not shown here. 


3 Application 

In this section, we use the methodology detailed in Section to analyze the mouse body 
weight data (see Section 1.1). Out of the 56 treatment groups in the original study, we 
limited our attention to 18 pre-screened treatments that the researchers are most interested 
in, as well as the control diet (Diet No.99). The 18 groups exhibited altered weight trajecto¬ 
ries, or were related chemically to groups that did. Since all the groups consumed the same 
number of calories, weight trajectories are related to the disposition of dietary calories be¬ 
tween body mass and metabolic energy, a key determinant of lifespan. Eor simplicity, we de¬ 
note these 19 diets asG = {21, 22, 23, 24, 27, 28, 29, 34,35, 39,42,43, 44,45,48, 53, 55,63, 99}. 


Similar to Section 2.4, the days on diet were rescaled prior to analysis. Eigure [2aj shows 
the mean weight trajectories for the 19 diet groups. Note that the mean weight estimates 
become unreliable as days on diet increases since mice died off in the process. 

As previously pointed out, since the subjects were on the same diet at t = 0, it is 
reasonable to assume the same intercept for all groups. The individual weight trajectories 
suggest that, unlike the simulated dataset, a quadratic term is needed to characterize the 
trajectories. Specifically, we re-write the LMM from Q as 

= OiQ + ait + a2t^ + Pg,l{'yg,l)t + g ,2{lg ,2)t^ + ^0,i + 9^0- (14) 


Given the quantity of the data, it is possible to propose more complex models with additional 
polynomial terms. However, the additional terms complicate the model and the additional 


coefficients are difficult to interpret scientifically. Hence, we considered the model at (14) 
as our full model. 
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Priors were specified as in the simulation study and 799 was set to 799 = (0, • • • , 0)^ as 
the baseline model. The Gibbs sampler was terminated by the relative standard deviation 
FWSR with e = 0.124 and 6 = 0.05, resulting in 115792 iterations with at least 1000 
effective samples for estimation of posterior mean of the parameters related to fixed-effects 
and variance components. The resulting MCMC outputs show the chain is mixing well 
and indicate that the variation among subjects outweighs the variance of the measurement 
errors. 


Table [^contains the posterior estimates from the proposed model, along with the MLE 
estimates from the R package lme4 for the LMM. The results from the proposed method are 
comparable to the MLEs in terms of point and interval estimates. However, the marginal 
inclusion probability from the proposed method provides a direct measure of the signifi¬ 
cance for each diet, which was unavailable in previous investigations using the frequentist 
approach. 


Figure shows the fitted weight trajectories colored based on the magnitude of their 
marginal inclusion probabilities. We can see that the treatment groups that have weight 
trajectories similar to the control group are the ones with lower inclusion probabilities. It 
shows that the marginal inclusion probability behaves well as a measure of the difference 
between a treatment diet and the control diet. Moreover, we find the suggested threshold 
0.8772 is a reasonable value to classify treatment diets into significantly/insignificantly 
different from the control diet (see Figure and Figure [31^. 


As an example, we compare four diets (Diet No.21 - Diet No.24) that are supplemented 
with 1.5, 2.5, 3.5, or 4.5 Nordihydroguaiaretic Acid (NDGA)/kg diet ( [Spindler et al. 2014c). 
Figure shows the mean weight trajectories and Figure |M| shows the fitted weight trajec¬ 
tories for the 4 diets supplemented with NDGA and the control diet. We find that the fitted 
trajectories correctly capture the characteristics of the mean weight trajectories for each 
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Fitted Weight Trajectories (insigniticant) 



(a) Estimated weight trajectories for treat¬ 
ment diets classified into insignificant and the 
control diet. 


Fitted Weight Trajectories (significant) 



(b) Estimated weight trajectories for treat¬ 
ment diets classified into significant and the 
control diet. 


Mean Weight Trajectories (NDGA diets) 



Rescaled Days on Diet 


Fitted Weight Trajectories (NDGA diets) 



(c) Mean weight trajectories for 4 NDGA 
supplemented diets and the control diet. 


(d) Estimated weight trajectories for 4 
NDGA supplemented diets and the control 
diet. 


Figure 3: Analysis of the experimental dataset based on the proposed model. 


diet, especially for the first half of the experiment when most mice were alive. These actual 
and fitted weight trajectories indicate that NDGA produced a dose-responsive decrease in 
body weight in the absence of a change in food consumption. These data suggest NDGA 
may have extend mouse lifespan by decreasing calorie absorption, inducing a state of caloric 
restriction, or by increasing metabolic rate. Further experiments will be required to resolve 
these possibilities. 


4 Discussion 


This article proposes a novel method for Bayesian variable selection on LMM to compare 
multiple treatments with a control. It is built upon a modification of the fractional prior 


proposed by Smith and Kohn (1997) and a component-wise Gibbs sampler. It provides 
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practitioners with a framework to incorporate prior knowledge of each treatment, as well 
as an intuitive evaluation of its significance. This method is quite general and has a wide 
range of potential applications in helds such as biology and medicine. 

The proposed method is advantageous in that multiple treatments are compared to a 
control group simultaneously. In addition, the Bayesian framework introduces marginal 
inclusion probabilities for each group that allow direct measure of the significance of each 
treatment, which is difficult using alternative frequentist approaches. Notice that, we in¬ 
troduce a vector 7 ^ = ( 7 g, 0 ) • • • j lg,p-iY' in © as indicator instead of a single 7 ^ for each 
group, because it allows a more in-depth comparison between two groups. In this paper, 
the application on the experimental dataset provides new insights for researchers to group 
and study the diets based on their levels of significance. 


We emphasize careful posterior inference when using MCMC methodology. One major 
challenge for practitioners is determining how long to run a simulation. While some simu¬ 
lations are so complex that a fixed time approach is the only practical one, this is not so for 


most experiments. We advocate the use of relative standard deviation FWSR (Flegal and 


Gong 2015; Gong and Flegal, 2015), since it is proved to be easy to use, theoretically valid 


and superior to using convergence diagnostics as a stopping criteria (Flegal et al. 2008 
Jones et ah, 2006| . 
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A Appendix 


Full conditional posterior distributions are derived from ([^, for f = 1 , • • • g = 1, - ■ ■ ,G, 
and j = 0, • • • ,p — l. To calculate the full conditional posterior Q{'yg,j\ct,'y-{g,j), b, y), we 
integrate out /3 in ([^ as Smith and Kohn (1996), 


Smith and Kohn (1996 


q{a,-f,b,a^,XD\y) = / q{a, P,'y,b,a^, XD\y)d/3 

G r 


(X 


7r(a)7r(A£))7r((T^) 


g=l L jgg J 

G 

^ n /, Up^y i\ot,l3g,'yg,bi,XD,cr‘^)7J:il3g\a,jg,b,a^)dPg 

g=l Pg i^g 


(15) 
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To calculate (15), define (f>i = Vi — Wia — Zibi. For a given g, consider 




Y{piyi\a, Ad, cj2)7r(y8g|a:,7g, 6, a^)dPg 


9 ieg 


(X 


Ip, 


i&g 


C7 exp <1 ((/)i - Xi{'rg)pg{'rg) ) Xi{lg)^g{lg) 


a TLi 

i&g 


2 ^7=0 '9,3 


X exp 


= a ^'69 "■* 


Pailg) - ( i'lg)M'1g^ { Y. 

\ V- J \ V- 

^ zGg ^ ^ iGg 

- (E (7,)Xi(7»)) ‘ (E 

^ *65 * ^ ^ iGg ^ J I 

IEia;t^l(7»)^i(77)l y 

Eia{l + i-)AT(7j)A-.(7,)l ) 


a‘ 


Y,xTi'yg)x,ijg) 


i&g 




g=i L ieg 


i£g 

T 


-1 


+ ( Y ( Y -^Tilg)M'rg)] f Y 


i&g 


i&g 


i&g 


1^(1 + -)^f hg)^i ) ( iy9)Mlg) ) ( {'tg)<t> 

'H / \ . _ 'H / \ . _ 'O' 


ieg 


i&g 


Therefore, (15) is further simplified 


^np/2 


exp{ 


Ap 

~Y 


EE^iwnn- 


3=1 i&g 3=1 j=l 
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A^i“^exp{-d2Ap} JJ 

9=1 
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X exp 
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9=1 


Y^^^^ + [Y 

i£g ^ i&g 


g'’'(l - TTp)' ^7.7exp{-l(Q: -rf3)7d4(a - da)} 
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Based on (16), the full posterior distribution is characterized by 


( \ u 2 \ i (7g)^*(7g)l 


X exp 


2 u 2 


eg'' ' rii 

r / 1 \ -1 


(7g)^i(7g) ) ( {'ig)(i>. 

_ i^g ^ i^g ^ ^ ^ ^ ^ ^ ^ 


+ ^)Xl (Jg}^) ( ^(1 + ^)Xj {'rg)Xi{jg)] ( ^(1 + ^)Xf {'ig)(l>i 

._ 'H / \ V- / \ V- 

z£g ^ ^ i£g ' ^ t£g 


where nf-igj) = (7g,0, • • • , Tgj-i, 7gj+i,''' , l 9 ,p-i? 
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Table 2: Fixed-effects estimates for the experimental dataset. For MLE, mean and 95% 
confidence interval (Cl) are presented. For posterior inference, posterior mean, 95% credible 
interval (Cl) and marginal inclusion probability (standard error in the parenthesis) are 
presented. 




MLE 


Posterior 


Parameter 

Mean 

95% Cl 

Mean 

95% Cl 

Pr(7.a,.7 = My) 

ao 

44.879 

(44.638, 45.120) 

44.903 

(44.720, 45.086) 


ai 

-4.124 

(-4.864, -3.384) 

-3.687 

(-4.134, -3.237) 


Oi2 

-0.992 

(-1.444, -0.540) 

-1.718 

(-2.076, -1.369) 



-3.671 

(-6.637, -0.705) 

-3.800 

(-6.202, -1.432) 

0.991(2.13e-4) 

1 ^ 21,2 

1.244 

(-0.553, 3.041) 

1.667 

(-0.151, 3.475) 

0.809(1.15e-3) 

1^22,1 

-7.894 

(-10.828, -4.959) 

-8.069 

(-10.353, -5.789) 

0.996(1.22e-5) 

1^22,2 

4.429 

(2.644, 6.214) 

4.611 

(2.780, 6.461) 

0.996(8.63e-6) 

/323,1 

-4.467 

(-7.282, -1.652) 

-4.902 

(-6.995, -2.810) 

0.996(1.73e-5) 

/^23,2 

1.819 

(0.203, 3.435) 

2.352 

(0.638, 4.046) 

0.970(4.75e-4) 

1 ^ 24,1 

-12.373 

(-15.317, -9.428) 

-12.453 

(-14.795, -10.103) 

0.996(8.64e-6) 

1 ^ 24,2 

5.349 

(3.533, 7.165) 

5.596 

(3.752, 7.444) 

0.996(8.64-e6) 

/527,1 

-6.643 

(-8.694, -4.592) 

-5.135 

(-6.671, -3.570) 

0.996(8.64e-6) 

/327,2 

3.173 

(1.948, 4.397) 

3.098 

(1.872, 4.344) 

0.996(8.64e-6) 

/528,1 

2.717 

(0.585, 4.849) 

3.899 

(2.227, 5.611) 

0.997(1.50e-5) 

1^28,2 

-0.969 

(-2.273, 0.335) 

-1.047 

(-2.423, 0.316) 

0.747(1.27e-3) 

1^29,1 

-13.462 

(-15.574,-11.350) 

-12.333 

(-13.946,-10.704) 

0.996(8.64e-6) 

1^29,2 

6.499 

(5.195, 7.803) 

6.432 

(5.097, 7.751) 

0.996(8.64e-6) 

1^34,1 

-0.768 

(-2.898, 1.363) 

-0.892 

(-2.441, 0.638) 

0.663(1.39e-3) 

1^34,2 

0.351 

(-1.002, 1.704) 

0.022 

(-1.315, 1.459) 

0.592(1.44e-3) 

(^5,1 

-1.552 

(-3.607, 0.503) 

-2.355 

(-3.911,-0.752) 

0.983(3.33e-4) 

1^35,2 

0.563 

(-0.670, 1.796) 

1.384 

(0.140, 2.657) 

0.920(7.83e-4) 

1^39,1 

3.183 

(0.955, 5.410) 

2.699 

(0.829, 4.532) 

0.975(4.29e-4) 

1^39,2 

-2.422 

(-3.877, -0.967) 

-2.212 

(-3.707, -0.659) 

0.976(4.16e-4) 

1^42,1 

3.625 

(1.560, 5.689) 

3.478 

(1.874, 5.054) 

0.996(2.86e-5) 

1 ^ 42,2 

-2.047 

(-3.298, -0.795) 

-1.686 

(-2.953, -0.423) 

0.963(5.28e-4) 

1 ^ 43,1 

1.627 

(-0.442, 3.697) 

1.298 

(-0.201, 2.902) 

0.815(1.13e-3) 

1^43,2 

-0.659 

(-1.910, 0.591) 

-0.364 

(-1.705, 0.929) 

0.584(1.45e-3) 

1344,1 

-2.230 

(-4.532, 0.068) 

-1.028 

(-2.777, 0.626) 

0.675(1.37e-3) 

1344,2 

0.853 

(-0.626, 2.331) 

0.274 

(-1.195, 1.889) 

0.549(1.46e-3) 

/345,1 

1.094 

(-1.122, 3.310) 

1.357 

(-0.351, 3.196) 

0.772(1.23e-3) 

1345,2 

-0.160 

(-1.631, 1.310) 

-0.696 

(-2.308, 0.875) 

0.633(1.41e-3) 

[348,1 

4.118 

(2.032, 6.204) 

3.361 

(1.695, 5.014) 

0.996(6.29e-5) 

[348,2 

-2.268 

(-3.541, -0.995) 

-1.728 

(-3.020, -0.439) 

0.960(5.48e-4) 

[353,1 

-1.138 

(-3.242, 0.966) 

-1.381 

(-2.955, 0.185) 

0.813(1.14e-3) 

[353,2 

0.047 

(-1.343, 1.249) 

0.176 

(-1.181, 1.574) 

0.530(1.47e-3) 

^55,1 

-1.714 

(-3.800, 0.372) 

-2.442 

(-4.072,-0.812) 

0.983(3.27e-4) 

[355,2 

0.695 

(-0.584, 1.975) 

1.356 

(0.061, 2.658) 

0.894(8.92e-4) 

[353,1 

-2.483 

(-4.590, -0.376) 

2.712 

(1.052, 4.376) 

0.991(2.07e-4) 

1 ^ 63,2 

-1.367 

(-2.678, -0.056) 

-1.305 

(-2.580, 0.019) 

0.863(1.OOe-3) 
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